1. LANCASTER PROBABILITIES AS THE PROPER FRAMEWORK

It is a pleasure to congratulate the authors for 
this excellent, original and pedagogical paper. I read 
a preliminary draft at the end of 2006 and I then 
mentioned to the authors that their work should be 
set within the framework of Lancaster probabilities, 
a remote corner of the theory of probability, now 
described in their Section 6.1. The reader is referred 
to Lancaster (1958, 1963, 1975) and the synthesis 
by Koudou (1995, 1996) for more details. 

Given probabilities $\mu(dx)$ and $\nu(dy)$ on spaces $X$ and $Y$, and given orthonormal bases $p = (p_n(x))$ and $q = (q_n(y))$ of $L^2(\mu)$ and $L^2(\nu)$, a probability $\sigma$ on $X \times Y$ is said to be of the Lancaster type if either there exists a sequence $\rho = (\rho_n)$ in $\ell^2$ such that

$$\sigma(dx,dy) = \Big[\sum_{n=0}^{\infty} \rho_n p_n(x) q_n(y)\Big]\,\mu(dx)\,\nu(dy)$$

or $\sigma$ is a weak limit of such probabilities. Alternatively, one can say that the sequence of signed measures $\big[\sum_{n=0}^{N} \rho_n p_n(x) q_n(y)\big]\mu(dx)\nu(dy)$ converges weakly toward the probability $\sigma$ when $N \to \infty$ (here $\rho$ does not need to be in $\ell^2$). An acceptable sequence $\rho = (\rho_n)$ is called a Lancaster sequence for the quadruple $(\mu,\nu,p,q)$. If $p_0 = q_0 = 1$ the margins of $\sigma$ are $(\mu,\nu)$. Writing

$$\sigma(dx,dy) = \mu(dx)K(x,dy) = \nu(dy)L(y,dx)$$

the probability kernel of the "x-chain" considered in the paper is

$$k(x,dx') = \int_Y K(x,dy)L(y,dx') = \Big[\sum_{n=0}^{\infty} \rho_n^2\, p_n(x) p_n(x')\Big]\,\mu(dx')$$

which clearly shows that $p_n$ is an eigenfunction for the eigenvalue $\rho_n^2$ of the operator $Tf(x) = \int f(x')\,k(x,dx')$.
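
The eigenfunction property can be checked by hand on the smallest possible example. The sketch below is only an illustration under assumptions that are not in the paper: the spaces are $X = Y = \{-1, +1\}$ with uniform margins, so that $p_0 = q_0 = 1$ and $p_1(x) = q_1(x) = x$, and the value chosen for $\rho_1$ is arbitrary. All function names are hypothetical.

```python
from fractions import Fraction

points = (-1, 1)
mu = {x: Fraction(1, 2) for x in points}   # uniform margin: p_0 = 1, p_1(x) = x
rho = Fraction(3, 5)                        # illustrative value of rho_1 in [-1, 1]

def sigma(x, y):
    """Lancaster probability [1 + rho * p_1(x) * q_1(y)] mu(x) nu(y)."""
    return (1 + rho * x * y) * mu[x] * mu[y]

def K(x, y):
    """Conditional law of y given x."""
    return sigma(x, y) / mu[x]

def L(y, x2):
    """Conditional law of x' given y."""
    return sigma(x2, y) / mu[y]

def k(x, x2):
    """x-chain kernel: sum over y of K(x, y) L(y, x')."""
    return sum(K(x, y) * L(y, x2) for y in points)

def T(f):
    """Operator T f(x) = sum over x' of f(x') k(x, x')."""
    return {x: sum(f(x2) * k(x, x2) for x2 in points) for x in points}

# Apply T to p_1(x) = x and compare with rho^2 * p_1.
Tp1 = T(lambda x: x)
print(Tp1, {x: rho ** 2 * x for x in points})
```

In this toy case the kernel reduces to $k(x,x') = (1 + \rho_1^2 x x')/2$, so applying $T$ to $p_1$ multiplies it by $\rho_1^2$, in line with the general statement.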

I will not comment here on the multivariate case, where $X$ and $Y$ are real spaces of higher dimension. Everything which is known about Lancaster probabilities and which is specific to this case is mentioned in Section 7 of the paper. To my knowledge, the Lancaster probabilities on a torus (a power of $\mathbb{R}/\mathbb{Z}$) associated to the trigonometric orthonormal polynomials have never been considered.

For the present time, the richest case is obviously the one where $X = Y = \mathbb{R}$ and where $p = (p_n)$ and $q = (q_n)$ are the orthonormal polynomials obtained by the Schmidt orthonormalization process in $L^2(\mu)$ and $L^2(\nu)$ applied to the sequences $(x^n)$ and $(y^n)$, assuming furthermore that $\int e^{a|x|}\mu(dx)$ and $\int e^{a|y|}\nu(dy)$ are finite for some $a > 0$. In the sequel, the term "Lancaster probabilities" will refer only to this real case. The following should be specified clearly:

Saying that conditions H1, H2 and H3 of Section 3 are all fulfilled is equivalent to saying that $P(dx,d\theta)$ is a Lancaster probability.

An elegant example can be found in Buja (1990, page 1049) with

$$\sigma(dx,dy) = \frac{a+b}{B(a,b)}\, x^{a-1} y^{b-1}\, 1_A(x,y)\,dx\,dy$$

where $a, b > 0$ and $A = \{(x,y);\ 0 < x, y;\ x + y < 1\}$. The margins are $\mu(dx) = \beta_{a,b+1}(dx)$ and $\nu(dy) = \beta_{b,a+1}(dy)$ and the Lancaster sequence is

$$\rho_n = \frac{(-1)^n \sqrt{ab}}{\sqrt{(a+n)(b+n)}}.$$
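
For polynomial Lancaster probabilities, $p_1$ and $q_1$ are the standardized linear functions, so $\rho_1$ is the correlation coefficient of the pair. The following sketch uses this to check the $n = 1$ term of Buja's sequence. It relies on the observation (an assumption of this note, easily verified from the density) that $\sigma$ is the law of the first two coordinates of a Dirichlet vector with parameters $(a, b, 1)$; the function names are hypothetical.

```python
from math import exp, lgamma, sqrt

def moment(a, b, i, j):
    """E[X^i Y^j] under the density (a+b)/B(a,b) x^(a-1) y^(b-1) on the simplex A."""
    log_m = (lgamma(a + i) + lgamma(b + j) + lgamma(a + b + 1)
             - lgamma(a) - lgamma(b) - lgamma(a + b + 1 + i + j))
    return exp(log_m)

def correlation(a, b):
    """Correlation coefficient of (X, Y), computed from the first and second moments."""
    ex, ey = moment(a, b, 1, 0), moment(a, b, 0, 1)
    cov = moment(a, b, 1, 1) - ex * ey
    var_x = moment(a, b, 2, 0) - ex ** 2
    var_y = moment(a, b, 0, 2) - ey ** 2
    return cov / sqrt(var_x * var_y)

def rho(n, a, b):
    """Lancaster sequence quoted from Buja (1990)."""
    return (-1) ** n * sqrt(a * b) / sqrt((a + n) * (b + n))

# The two numbers should agree up to rounding.
print(correlation(2.0, 3.5), rho(1, 2.0, 3.5))
```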

The paper under discussion is based on three observations. The first one is crucial: the two-component Gibbs sampler is very easy to perform with a Lancaster probability. This is the statement in Theorem 3.1. Parts a and b are well known but part c is elegant and surprising.
2. NATURAL EXPONENTIAL FAMILIES 

In order to explain the other two observations, let us introduce some notation: a (not necessarily bounded) positive measure $\mu$ on $\mathbb{R}$ is said to be in $\mathcal{M}(\mathbb{R})$ if it is not concentrated on one point and if its Laplace transform

$$L_\mu(\theta) = \int_{-\infty}^{\infty} e^{\theta x}\,\mu(dx)$$

is such that the interior $\Theta(\mu)$ of the interval $D(\mu) = \{\theta \in \mathbb{R};\ L_\mu(\theta) < \infty\}$ is not empty. To such a $\mu \in \mathcal{M}(\mathbb{R})$ one associates the one-dimensional natural exponential family (NEF):

$$F = F(\mu) = \{P(\mu,\theta)(dx) = e^{\theta x - k_\mu(\theta)}\mu(dx);\ \theta \in \Theta(\mu)\},$$

where $k_\mu = \log L_\mu$. Since $\theta \mapsto k_\mu(\theta)$ is strictly convex on $\Theta(\mu)$ the map $\theta \mapsto m = k_\mu'(\theta) = \int_{\mathbb{R}} x\,P(\mu,\theta)(dx)$ is injective and its image $M_F = k_\mu'(\Theta(\mu))$ is an open interval called the domain of the means of $F$. One denotes by $m \mapsto \theta = \psi_\mu(m)$ the inverse map from $M_F$ to $\Theta(\mu)$. Finally we say $F$ or $\mu$ is steep if $M_F$ is the interior of the convex support of $\mu$. For instance $D(\mu) = \Theta(\mu)$ (in this case $F$ is said to be regular) implies that $F$ is steep. The converse is not true. Diaconis and Ylvisaker (1979) show that if $F$ is regular, if $x_0 \in M_F$ and if $\lambda > 0$ then there exists a constant $C(x_0,\lambda)$ such that

$$\pi_{x_0,\lambda}(d\theta) = C(x_0,\lambda)\,e^{\lambda(x_0\theta - k_\mu(\theta))}\,1_{\Theta(\mu)}(\theta)\,d\theta$$

is a probability. We call $\{\pi_{x_0,\lambda};\ x_0 \in M_F,\ \lambda > 0\}$ the Diaconis-Ylvisaker family associated to the NEF $F$. We now reparameterize it by the mean. More specifically, denote by

$$\nu_{x_0,\lambda}(dm) = C(x_0,\lambda)\exp\big[\lambda\big(x_0\psi_\mu(m) - k_\mu(\psi_\mu(m))\big)\big]\,\psi_\mu'(m)\,1_{M_F}(m)\,dm$$

the image of $\pi_{x_0,\lambda}(d\theta)$ by $\theta \mapsto m = k_\mu'(\theta)$. Finally consider the distribution on $\mathbb{R} \times M_F$ defined by

$$\sigma(dx,dm) = P(\mu,\psi_\mu(m))(dx)\,\nu_{x_0,\lambda}(dm).$$
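
To make the reparameterization concrete, here is a small numerical sketch for the Poisson family, chosen here as an assumed example: $\mu = \sum_n \delta_n/n!$, so that $k_\mu(\theta) = e^\theta$, $\psi_\mu(m) = \log m$ and $M_F = (0,\infty)$. The formula for $\nu_{x_0,\lambda}$ then gives a density proportional to $m^{\lambda x_0 - 1}e^{-\lambda m}$, that is, a gamma law with shape $\lambda x_0$ and rate $\lambda$. The code checks numerically that this density has total mass 1 and mean $x_0$; the integration routine and all names are illustrative only.

```python
from math import exp, lgamma, log

def dy_density_mean_param(m, x0, lam):
    """Density of nu_{x0,lam} at m > 0 for the Poisson NEF (k(theta) = exp(theta), psi(m) = log m)."""
    psi, k_of_psi, dpsi = log(m), m, 1.0 / m
    # Normalizing constant C(x0, lam) of the gamma law with shape lam * x0 and rate lam.
    log_c = lam * x0 * log(lam) - lgamma(lam * x0)
    return exp(log_c + lam * (x0 * psi - k_of_psi)) * dpsi

def integrate(f, lo, hi, steps=200_000):
    """Plain midpoint rule; it never evaluates f at the endpoint m = 0."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

x0, lam = 1.5, 2.0
mass = integrate(lambda m: dy_density_mean_param(m, x0, lam), 0.0, 60.0)
mean = integrate(lambda m: m * dy_density_mean_param(m, x0, lam), 0.0, 60.0)
print(mass, mean)
```

The mean of $\nu_{x_0,\lambda}$ coming out equal to $x_0$ is the familiar linearity property of the Diaconis-Ylvisaker family.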

Note that the marginal distribution $\mu_1(dx)$ of $\sigma(dx,dm)$ does not belong to $F$ except in the normal case. (Proving this is an amusing exercise. It even holds when the reference measure $d\theta$ in the Diaconis-Ylvisaker family is replaced by any other positive measure.^1) The second observation of the paper, and a quite original one, is that $\sigma(dx,dm)$ is a Lancaster probability if $F$ is either binomial (Section 4.1), or Poisson (Section 4.2), or Gaussian (Section 4.3). An element of the Diaconis-Ylvisaker family associated with the binomial case $B(\theta,n)$ is the beta distribution $\nu_1(d\theta) = \beta_{a,b}(d\theta)$ and the marginal distribution of $X$ is the so-called hypergeometric distribution

(1)  $$\mu_1(dx) = \sum_{k=0}^{n} \binom{n}{k}\frac{(a)_k (b)_{n-k}}{(a+b)_n}\,\delta_k(dx).$$

The construction of a Lancaster probability with these margins $(\mu_1,\nu_1)$ has never been done before. Here the Lancaster sequence is $\rho_j = n!/[(a+b+n)_j\,(n-j)!]$ for $0 \le j \le n$ and $\rho_j = 0$ if $n < j$. The Lancaster probabilities obtained for $F$ = Poisson and $F$ = Gaussian are familiar and are mentioned in Koudou (1996, Section 3.3) and studied in Koudou (1995).

My guess is that these 3 types of NEF are the only ones with such a property: the property obviously fails for the three other quadratic NEF (negative binomial, gamma, hyperbolic), for which $\nu_{x_0,\lambda}(dm)$ has very few moments. The reader can check for example that the same is true for the NEF generated by a stable law of parameter $\alpha \in (0,1)$ concentrated on $(0,\infty)$ and defined by $k_\mu(\theta) = -c(-\theta)^\alpha$: recall that $\alpha = 1/2$ gives the celebrated Inverse Gaussian distributions (the case $\alpha \in [1,2)$ need not be investigated since it is not steep).

In order to explain the content of the third observation of the paper, we introduce the Jorgensen set $\Lambda(\mu)$ of $\mu \in \mathcal{M}(\mathbb{R})$. It is the set of $\lambda > 0$ such that there exists $\mu_\lambda \in \mathcal{M}(\mathbb{R})$ with $\Theta(\mu_\lambda) = \Theta(\mu)$ and $L_{\mu_\lambda} = (L_\mu)^\lambda$. We impose $0 \in \Lambda(\mu)$. For instance $\Lambda(\mu) = [0,\infty)$ if and only if $F(\mu)$ is made of infinitely divisible distributions. On the other hand $\Lambda(\mu)$ is the set of nonnegative integers if $\mu = \delta_0 + \delta_1$, namely if $F$ is the Bernoulli family. In general $\Lambda(\mu)$ can be a quite complicated additive semigroup: see Letac, Malouche and Maurer (2002) for its description when $\mu$ is the convolution of a negative binomial distribution with a Bernoulli distribution.

^1 The family $G$ obtained in this way is also a conjugate family to $F$, which means that the a posteriori distribution $\pi(d\theta|x)$ is in $G$ when the a priori distribution $\pi$ is in $G$. For this reason we do speak of the Diaconis-Ylvisaker family instead of the conjugate family of the paper, even if the latter has the characteristic property mentioned in Section 2.3.2.

Now consider $\mu \in \mathcal{M}(\mathbb{R})$ and $\lambda$ and $\eta$ in $\Lambda(\mu)$. Let

$$(X,Y) \sim P(\mu_\lambda,\theta) \otimes P(\mu_\eta,\theta).$$

Write $S = X + Y \sim P(\mu_{\lambda+\eta},\theta)$ (the distribution of $Y$ knowing $S$ does not depend on $\theta$) and denote by $\sigma(ds,dy)$ the joint distribution of $(S,Y)$. The authors observe that, when $F$ happens to be a quadratic NEF, $\sigma$ is a Lancaster probability: this is the essence of Section 5. However, this is a particular case of the following classical result mentioned in Eagleson (1964): suppose that $\lambda, \eta, \zeta$ are in $\Lambda(\mu)$ and let

$$(X,Y,Z) \sim P(\mu_\lambda,\theta) \otimes P(\mu_\eta,\theta) \otimes P(\mu_\zeta,\theta).$$

Denote by $\sigma(ds,dt)$ the joint distribution of $(S,T) = (X+Y,\,Y+Z)$. Then $\sigma$ is a Lancaster probability if $F$ is a quadratic NEF. More specifically if $(p_n^{\lambda})$ is the sequence of the orthonormal polynomials for $P(\mu_\lambda,\theta)$ and if $1/c_n(\lambda)$ is the positive square root of the coefficient of $x^n$ in $p_n^{\lambda}$ the corresponding Lancaster sequence is

(2) ^"(^) 

Thus Section 5 is based on the particular case $\lambda = n_1$, $\eta = n_2$, $\zeta = 0$ of this result.
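
Whatever the family, the term $\rho_1$ of such a Lancaster sequence is the correlation coefficient of $(S,T)$, since $p_1$ and $q_1$ are the standardized linear functions. Because the variance of $P(\mu_\lambda,\theta)$ is proportional to $\lambda$ for fixed $\theta$, one expects $\rho_1^2 = \eta^2/[(\lambda+\eta)(\eta+\zeta)]$. The sketch below checks this by exact enumeration in the Bernoulli case, where $P(\mu_\lambda,\theta)$ is binomial with $\lambda$ trials. The choice of family, the success probability and the function names are assumptions made for illustration; the code does not use the closed form (2).

```python
from fractions import Fraction
from itertools import product
from math import comb

def binom_pmf(n, p):
    """Exact binomial(n, p) probabilities for k = 0..n (p is a Fraction)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def squared_correlation(lam, eta, zeta, p):
    """Exact corr(S, T)^2 for S = X + Y, T = Y + Z, with independent
    X ~ Bin(lam, p), Y ~ Bin(eta, p), Z ~ Bin(zeta, p)."""
    px, py, pz = binom_pmf(lam, p), binom_pmf(eta, p), binom_pmf(zeta, p)
    es = et = ess = ett = est = Fraction(0)
    for (x, wx), (y, wy), (z, wz) in product(enumerate(px), enumerate(py), enumerate(pz)):
        w, s, t = wx * wy * wz, x + y, y + z
        es += w * s
        et += w * t
        ess += w * s * s
        ett += w * t * t
        est += w * s * t
    cov = est - es * et
    return cov * cov / ((ess - es * es) * (ett - et * et))

p = Fraction(1, 3)
print(squared_correlation(3, 2, 4, p))   # compare with eta^2 / ((lam+eta)(eta+zeta))
print(squared_correlation(3, 2, 0, p))   # case zeta = 0, where T = Y
```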

3. FINDING ALL LANCASTER FAMILIES 
WITH GIVEN MARGINS 

Given a pair of probabilities $(\mu,\nu)$ on $\mathbb{R}$ such that $\int e^{a|x|}\mu(dx)$ and $\int e^{a|y|}\nu(dy)$ are finite for some $a > 0$, consider the set $\mathcal{L}(\mu,\nu)$ of Lancaster probabilities $\sigma$ with margins $(\mu,\nu)$ and the set $S(\mu,\nu)$ of corresponding Lancaster sequences $\rho = (\rho_n)_{n=0}^{\infty}$. They are isomorphic compact convex sets which are completely known if we know their extreme points. We denote by $I(\mu)$ the smallest closed interval $I$ such that $\mu(I) = 1$. We consider several cases:

Case A. $I(\mu)$ is bounded, $I(\nu)$ is unbounded.

Case B. $I(\mu) = \mathbb{R}$ and $I(\nu)$ is a half-line.

Case C. $I(\mu) = I(\nu) = \mathbb{R}$.

Case D. $I(\mu)$ and $I(\nu)$ are half-lines.

Case E. $I(\mu)$ and $I(\nu)$ are bounded.

Cases A and B are easy: the only Lancaster probability is the product measure. Denote by $a_n > 0$ and by $b_n > 0$ the coefficients of $x^n$ in the orthonormal polynomials $p_n$ and $q_n$. Case C is quite interesting: a remarkable result of Tyan and Thomas (1975), extending an idea of Sarmanov and Bratoeva (1967), says that if $\gamma = \liminf (a_{2n}/b_{2n})^{1/2n}$ and if $\rho \in S(\mu,\nu)$, there exists a probability $\alpha(dt)$ on $[-\gamma,\gamma]$ such that $a_n\rho_n/b_n = \int_{-\gamma}^{\gamma} t^n\,\alpha(dt)$. Similarly in the case D, assuming without loss of generality that $I(\mu)$ and $I(\nu)$ are positive half-lines, if $\gamma = \liminf (a_n/b_n)^{1/n}$ then there exists a probability $\alpha(dt)$ on $[0,\gamma]$ such that $a_n\rho_n/b_n = \int_0^{\gamma} t^n\,\alpha(dt)$. The results of Tyan and Thomas (1975) can also essentially be found again in Tyan, Derin and Thomas (1976) and have been rediscovered by Christian Berg, quoted in Ismail (2005, page 114) who does not seem to be aware of this previous work.

We shall speak about case E later on. Note that for $\mu = \nu$ the results by Tyan and Thomas are quite exciting since they mean that a Lancaster sequence must be the moment sequence of a probability either on $[-1,1]$ (case C) or on $[0,1]$ (case D). If we are fortunate enough to prove that $\rho_n = t^n$ is a Lancaster sequence for all $t \in [-1,1]$ (case C) or all $t \in [0,1]$ (case D), by the theorems of Tyan and Thomas, we have a complete description of the Lancaster probabilities $\mathcal{L}(\mu,\mu)$ since they are parameterized by the probabilities $\alpha$ on $[-1,1]$ or on $[0,1]$. Interestingly enough, this is known to happen only for 4 types of $\mu$: Gaussian, Poisson, negative binomial and gamma. The corresponding Lancaster probabilities (see Bar-Lev et al., 1994) are the only ones which belong to a two-dimensional natural exponential family with variance function of the form

$$\begin{bmatrix} a(m_1) & f(m_1,m_2) \\ f(m_1,m_2) & a(m_2) \end{bmatrix}.$$

More specifically one can conjecture the following:

• If $I(\mu) = \mathbb{R}$ and if $(t^n)$ is in $S(\mu,\mu)$ for all $t \in [-1,1]$ then $\mu$ is Gaussian.

• If $I(\mu) = [0,\infty)$ and if $(t^n)$ is in $S(\mu,\mu)$ for all $t \in [0,1]$ then $\mu$ is either gamma, or Poisson, or negative binomial.
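
In the Gaussian case, the fact that $(t^n)$ is a Lancaster sequence is the classical Mehler formula: for the standard normal margin, $\sum_n t^n p_n(x)p_n(y)$ is the density of the bivariate normal law with correlation $t$ relative to the product of its margins. The following sketch, given only as an illustration with hypothetical function names, compares a truncated sum with that closed form, using the three-term recurrence of the orthonormal Hermite polynomials.

```python
from math import exp, sqrt

def hermite_orthonormal(u, terms):
    """Values p_0(u), ..., p_terms(u) of the Hermite polynomials orthonormal for N(0, 1)."""
    vals = [1.0, u]
    for n in range(1, terms):
        # p_{n+1}(u) = (u p_n(u) - sqrt(n) p_{n-1}(u)) / sqrt(n + 1)
        vals.append((u * vals[n] - sqrt(n) * vals[n - 1]) / sqrt(n + 1))
    return vals

def lancaster_sum(x, y, t, terms=200):
    """Truncated sum of t^n p_n(x) p_n(y), i.e. the Lancaster expansion with rho_n = t^n."""
    px, py = hermite_orthonormal(x, terms), hermite_orthonormal(y, terms)
    return sum(t ** n * a * b for n, (a, b) in enumerate(zip(px, py)))

def mehler_closed_form(x, y, t):
    """Bivariate normal density with correlation t divided by the product of N(0, 1) densities."""
    return exp(-(t * t * (x * x + y * y) - 2 * t * x * y) / (2 * (1 - t * t))) / sqrt(1 - t * t)

print(lancaster_sum(0.7, -1.2, 0.5), mehler_closed_form(0.7, -1.2, 0.5))
```

The truncation at 200 terms is an arbitrary choice that is ample for $|t|$ well below 1; the sum converges more slowly as $|t|$ approaches 1.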

In the gamma case, it is interesting to consider 
the classical two-dimensional distribution of Kibble 
(1941) and Moran (1967) with correlation r G [0, 1] 
and Jorgensen parameter q. It can be defined by its 
Laplace transform 

$$\int_0^\infty\!\!\int_0^\infty e^{-sx-ty}\,\sigma_r(dx,dy) = \big(1 + s + t + (1-r)st\big)^{-q}.$$
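
As a quick plausibility check of the role of $r$, the moments of the pair can be read off the Laplace transform by differentiating at the origin: the means equal $q$, the variances equal $q$, and the covariance equals $qr$, so the correlation is $r$. The sketch below recovers this numerically by central finite differences; the step size and the function names are arbitrary choices made for illustration.

```python
def laplace(s, t, r, q):
    """Laplace transform (1 + s + t + (1 - r) s t)^(-q) of the Kibble and Moran distribution."""
    return (1 + s + t + (1 - r) * s * t) ** (-q)

def correlation_from_laplace(r, q, h=1e-3):
    """Correlation of (X, Y) from finite-difference derivatives of the transform at (0, 0)."""
    f = lambda s, t: laplace(s, t, r, q)
    ex = -(f(h, 0) - f(-h, 0)) / (2 * h)                         # E[X] (= E[Y] by symmetry)
    exx = (f(h, 0) - 2 * f(0, 0) + f(-h, 0)) / h ** 2            # E[X^2]
    exy = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h ** 2)  # E[XY]
    return (exy - ex * ex) / (exx - ex * ex)

print(correlation_from_laplace(0.3, 2.5))   # should be close to r = 0.3
```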

As observed by D'jachenko (1962), this probability is actually a Lancaster probability with $\rho_n = r^n$, and thus an extremal one (the last three references are taken from Johnson and Kotz, 1972, pages 479-482). This means that in general $\sigma$ is a Lancaster probability for the gamma margins $\mu = \nu = \gamma_q$ if and only if it is a mixing of Kibble and Moran distributions, which means that there exists a probability distribution $\alpha(dr)$ on $[0,1]$ such that

$$\int_0^\infty\!\!\int_0^\infty e^{-sx-ty}\,\sigma(dx,dy) = \int_0^1 \big(1 + s + t + (1-r)st\big)^{-q}\,\alpha(dr).$$

Take for instance $\alpha(dr) = \beta_{\eta,q-\eta}(dr)$ to get back (2) for the gamma case and $\lambda = \zeta = q - \eta$.

For the cases C and D and for $\nu$ not an affine transformation of $\mu$, there is no known example where the set of the extreme points of $S(\mu,\nu)$ can be completely described. Koudou (1995, 1996) has shown that $\rho_n = t^n$ is a Lancaster sequence:

• for $\mu = P_a$ and $\nu = P_b$ ($P_a$ means Poisson distribution with mean $a$) for $0 \le t \le (a/b)^{1/2}$ if $a < b$;

• for $\mu = NB_{a,\lambda}$ and $\nu = NB_{b,\lambda}$ [the negative binomial distribution $NB_{a,\lambda}$ is $(1-a)^\lambda \sum_{n=0}^{\infty} \frac{(\lambda)_n}{n!}\,a^n\,\delta_n(dx)$] for $0 \le t \le (a/b)^{1/2}$ if $a < b$;

• for $\mu = NB_{a,\lambda}$ and $\nu = \gamma_\lambda$ for $0 \le t \le a^{1/2}$.

In these three cases, one can conjecture that one has obtained all the extreme points of $S(\mu,\nu)$.

Consider a hyperbolic distribution $\mu_q$ as described in Section 2.4 and simply defined by $L_{\mu_q}(\theta) = (\cos\theta)^{-q}$ with $q > 0$ and $\Theta(\mu_q) = (-\frac{\pi}{2}, \frac{\pi}{2})$. Lai and Vere-Jones (1975) have proved that $(t^n)$ is never in $S(\mu_q,\mu_q)$ (another proof is in Bar-Lev et al., 1994). Formula (2) applies here with the corresponding coefficients $c_n(q)$. In (2) we take $0 < \eta < q$ and $\lambda = \zeta = q - \eta$ to show that the sequence

$$\frac{c_n(\eta)}{c_n(q)} = \frac{1}{B(\eta,q-\eta)}\int_0^1 t^{n+\eta-1}(1-t)^{q-\eta-1}\,dt$$

is an element of $S(\mu_q,\mu_q)$. This illustrates the results of Tyan and Thomas with $\alpha(dt) = \beta_{\eta,q-\eta}(dt)$. One can conjecture (as done by Lai and Vere-Jones for $q = 1$) that such a Lancaster sequence indexed by $\eta \in [0,q]$ is an extreme point of $S(\mu_q,\mu_q)$, and that all extreme points are of this type.

4. THE CASE WHERE μ AND ν HAVE BOUNDED SUPPORT

This is the case E above. For the variety of results already obtained in the literature, this is the richest case. For future research, it is the most challenging. If $\mu = \nu$ suppose that there exists $x_0$ such that $|p_n(x)| \le p_n(x_0)$ $\mu$ almost surely, and consider

(3)  $$K(x,y,z) = \sum_{n=0}^{\infty} \frac{p_n(x)\,p_n(y)\,p_n(z)}{p_n(x_0)}.$$


Koudou (1995) has shown that $K \ge 0$ for almost all $(x,y,z)$ in the $\mu$ sense implies that the extreme points of $S(\mu,\mu)$ are defined by $\rho_n = p_n(x)/p_n(x_0)$ when $x$ describes the support of $\mu$. This extends a remarkable paper by Eagleson (1969) devoted to the case where $\mu$ is discrete with finite support, where it is shown in particular that $K \ge 0$ when $\mu$ is a binomial distribution. As mentioned in the paper, the analysis by Koudou (1996) of Gasper's (1971) delicate results shows that $K \ge 0$ when $\mu = \beta_{a,b}$ is a beta distribution such that $a, b \ge 1/2$ [note that the case $\min(a,b) < 1/2$ is open].
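
The positivity of $K$ in the binomial case can be explored by exact computation on small examples. The sketch below makes several assumptions for illustration: the binomial law is symmetric (success probability 1/2) on $\{0,\dots,N\}$, the point $x_0$ is taken to be $N$, and the orthogonal polynomials are built by Gram-Schmidt without normalization. With unnormalized polynomials $P_n$ of squared norm $h_n$, each term $p_n(x)p_n(y)p_n(z)/p_n(x_0)$ equals $P_n(x)P_n(y)P_n(z)/(P_n(x_0)h_n)$, so no square roots are needed. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb

def orthogonal_polys(N):
    """Gram-Schmidt on 1, x, ..., x^N in L^2 of the binomial(N, 1/2) law.
    Returns the tables of values of the monic polynomials and their squared norms."""
    pts = range(N + 1)
    mu = [Fraction(comb(N, x), 2 ** N) for x in pts]

    def inner(f, g):
        return sum(mu[x] * f[x] * g[x] for x in pts)

    polys, norms = [], []
    for n in range(N + 1):
        f = [Fraction(x) ** n for x in pts]
        for P, h in zip(polys, norms):
            c = inner(f, P) / h
            f = [f[x] - c * P[x] for x in pts]
        polys.append(f)
        norms.append(inner(f, f))
    return polys, norms

def kernel_K(N):
    """Table of K(x, y, z) from formula (3), with x0 = N (an assumption of this sketch)."""
    polys, norms = orthogonal_polys(N)
    x0 = N
    return {
        (x, y, z): sum(P[x] * P[y] * P[z] / (P[x0] * h) for P, h in zip(polys, norms))
        for x, y, z in product(range(N + 1), repeat=3)
    }

K2 = kernel_K(2)
print(min(K2.values()), K2[(2, 2, 2)])
print(min(kernel_K(4).values()))
```

Such a check is of course no substitute for Eagleson's proof; it only shows how the kernel of (3) can be tabulated exactly when the support is finite.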

The particular case $a = b \ge 1/2$ deserves a special mention. Using the transformation $x \mapsto 2x - 1$, we first move the distributions from $[0,1]$ to $[-1,1]$ and we introduce

$$\Delta(x,y,z) = 1 - x^2 - y^2 - z^2 + 2xyz.$$

For $-1 < z < 1$ we consider the plane domain $U_z = \{(x,y);\ \Delta(x,y,z) > 0\}$. This domain is limited by an ellipse $E_z$ tangent to the sides of the unit square $[-1,1]^2$. Denote

$$\mu_a(dx) = \frac{(1-x^2)^{a-1}}{2^{2a-1}B(a,a)}\,1_{(-1,1)}(x)\,dx.$$

The number $x_0$ involved in the definition of $K$ in (3) is 1, and the polynomials $p_n$ are the Jacobi polynomials with suitable parameters and normalized such that they become orthonormal with respect to $\mu_a$. With this notation, $K$ is zero outside of $U_z$ and is equal to

$$K_a(x,y,z) = C(a)\,\big[(1-x^2)(1-y^2)(1-z^2)\big]^{1-a}\,\Delta^{a-3/2}$$

in $U_z$. The important point is the following. For $z \in (-1,1)$ consider the extremal Lancaster probabilities $\sigma_z(dx,dy) = K_a(x,y,z)\,\mu_a(dx)\,\mu_a(dy)$. These Lancaster probabilities $\sigma_z$ are the only ones (together with the centered nonsingular Gaussian distributions with covariance of the type $\begin{bmatrix} 1 & r \\ r & 1 \end{bmatrix}$) to be elliptically contoured. More specifically, let $E = \mathbb{R}^2$ have the Euclidean structure such that $U_z$ is the unit disk. Saying that $\sigma_z$ is elliptically contoured means that $\sigma_z$ is invariant by the orthogonal group $O(E)$ of this Euclidean structure. This characterization is the consequence of an elegant result of McGraw and Wagner (1968). While most of the "results" about elliptically contoured distributions in $\mathbb{R}^n$ are trivially reduced to considerations about rotationally invariant distributions, this is not the case here. The reason is that the canonical basis of $\mathbb{R}^2$ is structurally important for Lancaster probabilities. On the other hand this canonical basis is not orthonormal for the Euclidean structure associated with a given elliptically contoured distribution and this is what makes the McGraw and Wagner result attractive.
Koudou (1995) shows that we have K >0 when 



^^(dx) = — ^l(-p,p)(a;) dx 

where q> and p = 2^/ {1 + q). When q is an in- 
teger this strange probability is the Plancherel mea- 
sure of the Gelfand pair associated to the homo- 
geneous tree where every vertex has q + 1 neigh- 
bors. The corresponding polynomials are called the 
Cartier-Dunau polynomials in the literature (see Arnaud, 1994). A general theory of the probabilities $\mu$ with bounded support such that the function $K$ of (3) is positive could be a subject of research. As an example, I do not know whether $K \ge 0$ or not when $\mu$ is the hypergeometric distribution (1) considered in the paper, where the orthonormal polynomials are the Hahn polynomials.

When $\mu$ and $\nu$ are two probabilities with bounded support such that $\nu$ is not an affine transform of $\mu$, the search of extreme points of the Lancaster measures does not seem to have been done for any example. Suppose that we have found some $\rho \in S(\mu,\nu)$. A good way to create other elements of $S(\mu,\nu)$ is to pick $a \in S(\mu,\mu)$ and $b \in S(\nu,\nu)$. It is easy to see that $(a_n\rho_n b_n)_{n=0}^{\infty}$ is also in $S(\mu,\nu)$. Applying this remark to the interesting pair $(\mu_1,\nu_1)$ defined by (1) and to the new Lancaster sequence $\rho_j = n!/[(a+b+n)_j\,(n-j)!]$ discovered by the authors would lead to a better understanding of $S(\mu_1,\nu_1)$.

5. CONCLUSION 

We referred to several bright papers by Eagle- 
son, Koudou, McGraw and Wagner or Tyan and 
Thomas, and to a genuine masterpiece by Gasper. 
Many stimulating questions and conjectures remain, 
regarding in particular special functions and group 
theory through the function K. The present paper 
shows us how unexpectedly these bivariate probabil- 
ities can be important for very practical questions: 
it will be in turn a new landmark of the theory of 
Lancaster probabilities. 